Bioinformatics of Brain Diseases


FIGURE 8.2

Total number of experiments in the GEO repository for brain diseases and disorders, and total number of genes associated with the studied diseases and disorders from the DisGeNET database (as of August 2023).

biological information. Most manufacturers provide analysis tools alongside their microarray or RNA-seq products. However, there are also open-source tools that offer various methods for analyzing the data.

Bioconductor is an open-source software project based on the R programming language that helps analyze genomic data (both microarray and RNA-seq) generated by wet-lab experiments (https://www.bioconductor.org) [23].
It is essentially a repository of R packages. It currently hosts 3593 packages, most of which are software packages (2230), alongside annotation (912), experimental data (421), and workflow (30) packages (as of August 2023). Among them are genomic data analysis packages such as limma (linear models for microarray data), a package that fits a linear model to determine the differential expression of genes after the data have been normalized, for example with RMA (Robust Multi-array Average), to account for noise [24]. In addition to limma, there are other
packages in Bioconductor that are used to analyze RNA-seq and microarray data, such as edgeR and DESeq2 [25]. The edgeR package uses an overdispersed Poisson (negative binomial) model to account for both biological and technical variation [26]. The DESeq2 approach uses shrinkage estimation for dispersions and fold changes to improve the stability and interpretability of estimates [27]. As previously stated, these and
other packages implement a variety of statistical methodologies for differential analysis. Once the analysis results are available, we can identify significant differential expression by applying cutoffs such as p-value and fold-change thresholds.
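
The cutoff step described above can be illustrated with a minimal sketch. It assumes the differential analysis has already produced a log2 fold change and a p-value for each gene. The names `GeneResult` and `select_significant`, the thresholds (p < 0.05 and an absolute log2 fold change of at least 1), and the gene entries are all hypothetical, chosen only for illustration. In practice the p-values are often adjusted for multiple testing first; that step is omitted here.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """One row of a differential expression result table (hypothetical layout)."""
    gene: str
    log2_fold_change: float
    p_value: float


def select_significant(results, p_cutoff=0.05, min_abs_log2fc=1.0):
    """Keep genes that pass both the p-value and the fold-change cutoff.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    return [
        r for r in results
        if r.p_value < p_cutoff and abs(r.log2_fold_change) >= min_abs_log2fc
    ]


# Made-up values, not taken from any real experiment.
results = [
    GeneResult("GENE_A", 2.3, 0.001),   # passes both cutoffs
    GeneResult("GENE_B", 0.4, 0.0004),  # small p-value, but small fold change
    GeneResult("GENE_C", -1.8, 0.03),   # down-regulated, passes both cutoffs
    GeneResult("GENE_D", 3.1, 0.20),    # large fold change, but not significant
]

significant = select_significant(results)
print([r.gene for r in significant])  # ['GENE_A', 'GENE_C']
```

Taking the absolute value of the log2 fold change keeps both up-regulated and down-regulated genes, and requiring both cutoffs at once excludes genes whose change is statistically significant but small in magnitude.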